Litmus

Parallel autonomous ML research agents with a Director, git worktrees for per-agent experiment branches, a Skills library for validated technique reuse, a Sy...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
by Kuber Mehta (@kuberwastaken)
Security Scan
VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
Name/description (parallel autonomous ML agents, git worktrees, skills library, synthesizer, Director) match the declared requirements and included scripts: uv, git, python3, cron setup, repo cloning, and per-agent worktrees under ~/.litmus. The optional CLAWRXIV_API_KEY relates to the ClawRxiv publishing feature and is consistent with the docs.
Instruction Scope
SKILL.md and scripts instruct the agent to clone the autoresearch harness, create a shared lab git repo and worktrees under ~/.litmus, install Python deps via uv, download ~1GB data, register cron jobs (via OpenClaw cron tool), spawn native subagents via sessions_spawn, and read/write structured state in ~/.litmus/shared/. All file and runtime actions are confined to the declared configPath (~/.litmus/) but the instructions do grant autonomous agents the ability to modify code in the shared repo and run experiments (i.e., execute arbitrary training code changes).
Install Mechanism
There is no packaged install spec; setup.sh clones GitHub (karpathy/autoresearch) and runs 'uv sync' to install Python dependencies. INSTALL.md references installing uv via a curl | sh from astral.sh (remote install script). Cloning from GitHub is expected for this purpose, but the remote 'curl | sh' pattern and uv installing packages are higher-risk operations — expected for a research harness but worth reviewing before running.
Credentials
The skill declares no required environment variables and only an optional CLAWRXIV_API_KEY for the ClawRxiv publishing integration. No unrelated credentials or excessive env requirements are requested. Runtime behavior reads/writes only to ~/.litmus/ paths declared in metadata.
Persistence & Privilege
always:false (normal). The skill instructs registering multiple cron jobs that schedule autonomous OpenClaw agent turns (Director, Synthesizer, Watchdog, etc.). Autonomous invocation and scheduled jobs are expected for this functionality but increase the blast radius — these scheduled jobs will run without further user interaction unless you choose not to register them.
Scan Findings in Context
[pre-scan-injection-none] expected: The pre-scan reported no injection signals. The repository does include network operations (git clone, curl for uv installer, optional curl to clawrxiv.io) which are expected for this kind of tool.
Assessment
This skill is internally consistent with its purpose, but it performs powerful local and autonomous operations. Before installing or running it:

  1) Review the scripts (setup.sh, setup-cron.sh, prepare-agents.sh, and the referenced synthesizer/leisure scripts) yourself to ensure you understand what will run.
  2) Run setup and experiments in an isolated environment (VM, container, or non-production machine): the agents will clone code, install Python packages, create git branches, and run arbitrary training code on your machine.
  3) Pay attention to the 'curl | sh' uv installation referenced in INSTALL.md and consider installing uv manually from a trusted source instead.
  4) If you do not want remote publishing, disable ClawRxiv publishing (do not populate ~/.litmus/config.json with a clawrxiv api key and leave publishing flags off).
  5) If you prefer to avoid autonomous scheduled runs, skip the cron registration step and run agent commands manually.
  6) Verify you trust the upstream autoresearch code (karpathy/autoresearch) that will be cloned and executed.

If any of these points worry you, do not run setup.sh on machines with sensitive data or shared user access.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.1.1
Tags: autoresearch · latest · ml-research

Runtime requirements

🔬 Clawdis
OS: Linux · macOS
Bins: uv, git, python3

SKILL.md

Litmus — Parallel Autonomous ML Research Agents

Litmus spawns multiple OpenClaw subagents that experiment on your GPU overnight. Each runs on its own git branch in a shared lab repository — every experiment is a commit, agents can read each other's code, cherry-pick breakthroughs, and build on the global best at any time.

Validated techniques accumulate in a Skills library (~/.litmus/shared/skills/). A Synthesizer runs at 04:00 to distill collective knowledge into skills and write a research agenda for the next day. A Director runs every 2 hours to steer workers, trigger Compass Resets on stagnation, and orchestrate cross-agent knowledge transfer.

What makes it more than autoresearch:

  • Git worktrees: agents share one repo, each on their own branch — full experiment history, cherry-pick, and cross-agent code inspection via git -C ~/.litmus/repo log --all
  • Skills library: validated techniques persist and compound — agents don't re-discover wins
  • Synthesizer: distills all overnight notes into reusable skills and a research agenda
  • Compass Reset: Director detects stagnation and forces structured pivots using the skills gap
  • Two-phase experiment budget: quick 90-second check before committing to a full run
  • Structured attempt records: JSON per experiment in shared/attempts/ for rich analytics
  • Leisure mode (03:00–06:00): workers read papers, write moonshot hypotheses, identify gaps
  • Morning digest: research narrative delivered to your chat at 08:00
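
The two-phase experiment budget from the list above can be sketched as follows. This is a minimal illustration, not the logic in program.md: the `run` callback, the `quick_baseline_bpb` reference value, the 300-second full budget, and the 5% tolerance are all assumptions made for the example. Only the 90-second quick check comes from the description.

```python
from typing import Callable, Optional


def two_phase_run(
    run: Callable[[float], float],
    quick_baseline_bpb: float,
    quick_seconds: float = 90.0,   # quick-check length named in the feature list
    full_seconds: float = 300.0,   # hypothetical full-run budget
    tolerance: float = 0.05,       # hypothetical slack allowed in the quick check
) -> Optional[float]:
    """Run a short check first; spend the full budget only if it looks promising.

    `run(seconds)` is assumed to train for at most `seconds` and return
    val_bpb (lower is better). `quick_baseline_bpb` is the value the current
    baseline reaches in the same quick check.
    Returns the full-run val_bpb, or None if the quick check was rejected.
    """
    quick_bpb = run(quick_seconds)
    if quick_bpb > quick_baseline_bpb * (1 + tolerance):
        return None  # clearly worse than baseline: skip the expensive run
    return run(full_seconds)
```

The point of the gate is that a change which is already clearly worse after the short check never consumes a full training slot.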

Everything is a native OpenClaw subagent. No external processes, no PID files.


First-Time Setup

Recommended — ask your OpenClaw agent (runs a guided onboarding conversation):

"Install https://clawhub.ai/kuberwastaken/litmus and set it up for my machine"

Full onboarding instructions: {baseDir}/references/onboarding.md — read that file first.

Or manually:

git clone https://github.com/kuberwastaken/litmus ~/.litmus
bash ~/.litmus/scripts/setup.sh

The setup script clones Karpathy's autoresearch training harness, builds the shared lab git repo at ~/.litmus/repo/, installs Python dependencies via uv, and downloads ~1 GB of training data. Wait for it to finish before moving on.
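
Before running setup, it may help to confirm that the binaries listed under the runtime requirements (uv, git, python3) are on PATH. The helper below is a hypothetical pre-flight check and is not part of the Litmus scripts; it only uses the standard library's `shutil.which`.

```python
import shutil

# Binaries named in the runtime requirements of this listing.
REQUIRED_BINS = ("uv", "git", "python3")


def missing_bins(required=REQUIRED_BINS, which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_bins()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required binaries found.")
```

The `which` parameter exists only so the lookup can be swapped out in tests.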


Starting Research

1 — Prepare workspaces (creates git worktrees)

bash {baseDir}/scripts/prepare-agents.sh --agents 4 --templates architecture,optimizer,general,general

This creates git worktrees under ~/.litmus/agents/, each on its own branch in ~/.litmus/repo/. Because all agents share one lab repo, every agent's experiments are immediately visible to the others:

git -C ~/.litmus/repo log --all --oneline --graph
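
To see which worktree sits on which branch, the output of `git worktree list --porcelain` can be parsed. The sketch below is an illustration only; the function names are made up for this example, and it assumes the repo path shown above.

```python
import subprocess
from pathlib import Path


def parse_worktrees(porcelain: str) -> list[dict[str, str]]:
    """Parse `git worktree list --porcelain` output into one dict per worktree.

    Entries are separated by blank lines; each line is "key value" or a bare
    key such as "detached", which is stored with an empty string as value.
    """
    entries = []
    for block in porcelain.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, _, value = line.partition(" ")
            entry[key] = value
        if entry:
            entries.append(entry)
    return entries


def list_worktrees(repo: Path) -> list[dict[str, str]]:
    """Ask git for the worktrees attached to `repo` and parse the result."""
    out = subprocess.run(
        ["git", "-C", str(repo), "worktree", "list", "--porcelain"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_worktrees(out)
```

For example, `list_worktrees(Path.home() / ".litmus" / "repo")` would return one entry per agent workspace plus the main checkout, each with `worktree`, `HEAD`, and usually `branch` keys.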

2 — Spawn research subagents

sessions_spawn
  task: "Read program.md in your current directory and run the research loop forever."
  runtime: "subagent"
  mode: "session"
  agentId: "litmus-worker-arch-1"
  cwd: "~/.litmus/agents/arch-1"

Repeat for each agent, then:

sessions_yield message: "Research agents running. I'll notify you on new discoveries."
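
Since the spawn call is repeated once per agent, the parameter sets can be generated from the worktree directories. This is a sketch under assumptions: it takes the `litmus-worker-<name>` ID pattern and the `cwd` layout from the single example above and assumes they hold for every agent. It only builds the parameters; issuing `sessions_spawn` is still up to the OpenClaw agent.

```python
from pathlib import Path

TASK = "Read program.md in your current directory and run the research loop forever."


def spawn_params(agents_dir: Path) -> list[dict[str, str]]:
    """Build one sessions_spawn parameter set per directory under agents_dir."""
    params = []
    for workdir in sorted(p for p in agents_dir.iterdir() if p.is_dir()):
        params.append({
            "task": TASK,
            "runtime": "subagent",
            "mode": "session",
            # Assumed naming pattern, inferred from "litmus-worker-arch-1".
            "agentId": f"litmus-worker-{workdir.name}",
            "cwd": str(workdir),
        })
    return params
```

Calling `spawn_params(Path.home() / ".litmus" / "agents")` yields absolute `cwd` paths rather than the `~`-prefixed form shown above.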

Templates: architecture · optimizer · regularization · general

Full template details: {baseDir}/references/templates/

3 — Start the Director Layer

bash {baseDir}/scripts/setup-cron.sh --timezone "Your/Timezone"

Registers 6 cron jobs:

| Cron | Default schedule | Role |
| --- | --- | --- |
| litmus-director | Every 2h during research hours | Reviews results, steers workers, Compass Reset on stagnation |
| litmus-leisure | 03:00 daily | Switches workers to paper-reading / creative thinking mode |
| litmus-synthesizer | 04:00 daily | Distills notes into skills library, writes research agenda |
| litmus-dawn | 06:00 daily | Wakes workers, queues synthesizer's priority experiments |
| litmus-watchdog | Every 30 min | Liveness check, escape mode on zero improvements |
| litmus-digest | 08:00 daily | Morning research narrative delivered to your chat |
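
The Director and Watchdog rows both react to a lack of progress. How Litmus actually detects stagnation is defined in director.md and watchdog.md, which are not reproduced here; the sketch below shows one plausible rule, assuming a chronological list of val_bpb values (lower is better) and a hypothetical window of 10 attempts.

```python
def is_stagnant(val_bpbs: list[float], window: int = 10, min_delta: float = 0.0) -> bool:
    """Return True if the last `window` attempts did not beat the earlier best.

    `val_bpbs` is ordered oldest to newest. An improvement only counts if it
    lowers the best value by more than `min_delta`.
    """
    if len(val_bpbs) <= window:
        return False  # not enough history to judge
    best_before = min(val_bpbs[:-window])
    best_recent = min(val_bpbs[-window:])
    return best_recent >= best_before - min_delta
```

A positive `min_delta` treats tiny, noise-level gains as no progress, which makes the trigger fire earlier.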

All times are configurable during onboarding — the setup agent pitches defaults and asks what you'd like to change. Common presets: night owl (01:00/02:00/04:00/07:00), early bird (23:00/00:30/02:00/05:30), intensive (1h director). Pass custom times to scripts/setup-cron.sh with --leisure-start, --synthesizer-time, --dawn-time, --digest-time, --director-hours, --watchdog-minutes.
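
The flags above can be assembled into a command line programmatically. The flag names are taken from the paragraph above; the value formats used in the usage note ("HH:MM" strings, a plain number of hours) are assumptions, so check `scripts/setup-cron.sh` before relying on them. The function itself is hypothetical and only builds the argument list.

```python
import shlex

# Flags listed in the text, written as Python keyword names.
ALLOWED = {
    "leisure_start", "synthesizer_time", "dawn_time",
    "digest_time", "director_hours", "watchdog_minutes",
}


def setup_cron_command(script: str, timezone: str, **overrides: object) -> list[str]:
    """Build the argv for setup-cron.sh with optional schedule overrides."""
    unknown = set(overrides) - ALLOWED
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    argv = ["bash", script, "--timezone", timezone]
    for name, value in overrides.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


if __name__ == "__main__":
    argv = setup_cron_command(
        "scripts/setup-cron.sh", "Europe/Berlin",
        leisure_start="01:00", director_hours=1,
    )
    print(shlex.join(argv))
```

Run as a script, this prints a shell-quoted command that can be reviewed before it is executed.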


Managing Agents

Status (experiment counts, best val_bpb, git tree):

bash {baseDir}/scripts/status.sh

Leaderboard (cross-agent, from shared/attempts/ JSON):

bash {baseDir}/scripts/results.sh --top 10
bash {baseDir}/scripts/results.sh --agent arch-1  # single agent
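
For custom analysis beyond results.sh, the attempt records can be read directly. The sketch below is not how results.sh is implemented; it assumes each file in shared/attempts/ is a JSON object with a numeric `val_bpb` key and that lower val_bpb is better. Records that fail to parse or lack the key are skipped.

```python
import json
from pathlib import Path


def top_attempts(attempts_dir: Path, n: int = 10) -> list[dict]:
    """Return the n attempt records with the lowest val_bpb."""
    records = []
    for path in attempts_dir.glob("*.json"):
        try:
            record = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed record
        if isinstance(record, dict) and isinstance(record.get("val_bpb"), (int, float)):
            records.append(record)
    return sorted(records, key=lambda r: r["val_bpb"])[:n]
```

For example, `top_attempts(Path.home() / ".litmus" / "shared" / "attempts", n=10)` gives a cross-agent ranking comparable to `--top 10`.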

Full lab git history (all agents' experiments as a tree):

git -C ~/.litmus/repo log --all --oneline --graph

Inspect any experiment:

git -C ~/.litmus/repo show <commit-hash>  # see what changed
cat ~/.litmus/shared/attempts/<hash>.json  # see the metrics

Steer (redirect mid-run, no restart):

subagents action: "steer"  target: "litmus-worker-arch-1"
  message: "Stop refining depth. Checkout the best commit from opt-2 and combine their LR with DEPTH=10."

Stop:

subagents action: "kill"  target: "all"

What Agents Write Overnight

| Path | Contents |
| --- | --- |
| ~/.litmus/shared/attempts/<hash>.json | Structured record for every experiment (agent, val_bpb, status, title) |
| ~/.litmus/shared/skills/<name>.md | Validated reusable techniques with YAML frontmatter |
| ~/.litmus/shared/notes/discoveries/ | Per-improvement discovery notes |
| ~/.litmus/shared/notes/anomalies/ | Unexpected result notes |
| ~/.litmus/shared/notes/moonshots/ | Speculative hypotheses from leisure |
| ~/.litmus/shared/notes/synthesis/ | Synthesizer's research agenda and combination matrix |
| ~/.litmus/shared/discoveries.md | Cross-agent knowledge base (flat, for quick reading) |
| ~/.litmus/shared/midnight-reflections.md | Leisure agent's nightly narrative |
| ~/.litmus/repo/ (git) | All experiment commits across all agents on their branches |
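
Skill files carry YAML frontmatter, so a reader needs to separate that header from the body. The sketch below is a deliberately simple splitter that handles only flat `key: value` lines and is not a YAML parser; the frontmatter keys Litmus actually writes are not listed on this page, so none are assumed.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a '---' delimited header from the body of a skill file.

    Returns (metadata, body). If the text has no complete frontmatter block,
    metadata is empty and the body is the original text.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # opening delimiter without a closing one
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])
```

Nested values, lists, or quoted strings in the header would need a real YAML library instead.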

Reference Files

  • {baseDir}/references/onboarding.md — first-time setup conversation
  • {baseDir}/references/program.md — worker agent loop (git-aware, skills-reading, two-phase budget)
  • {baseDir}/references/director.md — Director cron (Compass Reset, cross-pollination)
  • {baseDir}/references/leisure.md — Leisure mode (paper reading, structured notes, skill extraction)
  • {baseDir}/references/synthesizer.md — Synthesizer cron (knowledge distillation, skills library)
  • {baseDir}/references/dawn.md — Dawn cron (wake workers, queue experiments)
  • {baseDir}/references/watchdog.md — Watchdog cron (liveness, escape mode)
  • {baseDir}/references/digest.md — Morning digest (research narrative)
  • {baseDir}/references/templates/ — Research focus templates
  • {baseDir}/references/clawrxiv.md — ClawRxiv integration (optional auto-publishing)
